{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/klavis-ai/klavis/blob/main/examples/gemini/Use_Klavis_with_Gemini.ipynb)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "# Gemini + Klavis AI Integration\n",
    "\n",
    "This tutorial demonstrates how to use Google's Gemini with function calling with Klavis MCP (Model Context Protocol) servers.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## Prerequisites\n",
    "\n",
    "- **Gemini API key** - Get at [ai.google.dev](https://ai.google.dev/)\n",
    "- **Klavis API key** - Get at [klavis.ai](https://klavis.ai/)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Note: you may need to restart the kernel to use updated packages.\n"
     ]
    }
   ],
   "source": [
    "# Install the required packages\n",
    "%pip install -qU google-generativeai klavis"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import webbrowser\n",
    "from google import genai\n",
    "from google.genai import types\n",
    "from klavis import Klavis\n",
    "from klavis.types import McpServerName, ToolFormat\n",
    "\n",
    "# Set environment variables (you can also use .env file)\n",
    "os.environ[\"GEMINI_API_KEY\"] = \"YOUR_GEMINI_API_KEY\"  # Replace with your actual Gemini API key\n",
    "os.environ[\"KLAVIS_API_KEY\"] = \"YOUR_KLAVIS_API_KEY\"  # Replace with your actual Klavis API key\n",
    "\n",
    "# Initialize clients\n",
    "gemini_client = genai.Client(api_key=os.getenv(\"GEMINI_API_KEY\"))\n",
    "klavis_client = Klavis(api_key=os.getenv(\"KLAVIS_API_KEY\"))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## Case Study 1 : Gemini + YouTube MCP Server\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "#### Step 1 - Create YouTube MCP Server using Klavis\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\ud83d\udd17 YouTube MCP server created at: https://youtube-mcp-server.klavis.ai/mcp/?instance_id=1910fcd2-426a-4e67-afbe-39234e044db9, and the instance id is 1910fcd2-426a-4e67-afbe-39234e044db9\n"
     ]
    }
   ],
   "source": [
    "youtube_mcp_instance = klavis_client.mcp_server.create_server_instance(\n",
    "    server_name=McpServerName.YOUTUBE,\n",
    "    user_id=\"1234\",\n",
    ")\n",
    "\n",
    "print(f\"\ud83d\udd17 YouTube MCP server created at: {youtube_mcp_instance.server_url}, and the instance id is {youtube_mcp_instance.instance_id}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "#### Step 2 - Create general method to use MCP Server with Gemini\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "def gemini_with_mcp_server(mcp_server_url: str, user_query: str):\n",
    "    # Get tools from MCP server\n",
    "    mcp_server_tools = klavis_client.mcp_server.list_tools(\n",
    "        server_url=mcp_server_url,\n",
    "        format=ToolFormat.GEMINI,\n",
    "    )\n",
    "    print(f\"\ud83d\udce6 Available tools: {mcp_server_tools}\")\n",
    "    \n",
    "    # Prepare conversation contents\n",
    "    contents = [types.Content(role=\"user\", parts=[types.Part(text=user_query)])]\n",
    "    \n",
    "    # Generate response with function calling\n",
    "    response = gemini_client.models.generate_content(\n",
    "        model='gemini-1.5-pro',\n",
    "        contents=contents,\n",
    "        config=types.GenerateContentConfig(tools=mcp_server_tools.tools)\n",
    "    )\n",
    "    \n",
    "    if response.candidates and response.candidates[0].content.parts:\n",
    "        contents.append(response.candidates[0].content)\n",
    "        \n",
    "        # Check if there are function calls to execute\n",
    "        has_function_calls = False\n",
    "        for part in response.candidates[0].content.parts:\n",
    "            if hasattr(part, 'function_call') and part.function_call:\n",
    "                has_function_calls = True\n",
    "                print(f\"\ud83d\udd27 Calling function: {part.function_call.name}\")\n",
    "                \n",
    "                try:\n",
    "                    # Execute tool call via Klavis\n",
    "                    function_result = klavis_client.mcp_server.call_tools(\n",
    "                        server_url=mcp_server_url,\n",
    "                        tool_name=part.function_call.name,\n",
    "                        tool_args=dict(part.function_call.args),\n",
    "                    )\n",
    "                    \n",
    "                    # Create function response in the proper format\n",
    "                    function_response = {'result': function_result.result}\n",
    "                    \n",
    "                except Exception as e:\n",
    "                    print(f\"Function call error: {e}\")\n",
    "                    function_response = {'error': str(e)}\n",
    "                \n",
    "                # Add function response to conversation\n",
    "                function_response_part = types.Part.from_function_response(\n",
    "                    name=part.function_call.name,\n",
    "                    response=function_response,\n",
    "                )\n",
    "                function_response_content = types.Content(\n",
    "                    role='tool', \n",
    "                    parts=[function_response_part]\n",
    "                )\n",
    "                contents.append(function_response_content)\n",
    "        \n",
    "        if has_function_calls:\n",
    "            # Generate final response after function calls\n",
    "            final_response = gemini_client.models.generate_content(\n",
    "                model='gemini-1.5-pro',\n",
    "                contents=contents,\n",
    "                config=types.GenerateContentConfig(tools=mcp_server_tools.tools)\n",
    "            )\n",
    "            return final_response.text\n",
    "        else:\n",
    "            # No function calls, return original response\n",
    "            return response.text\n",
    "    else:\n",
    "        return \"No response generated.\"\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "#### Step 3 - Summarize your favorite video!\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\ud83d\udce6 Available tools: success=True tools=[{'function_declarations': [{'name': 'get_youtube_video_transcript', 'description': \"Retrieve the transcript or video details for a given YouTube video. The 'start' time in the transcript is formatted as MM:SS or HH:MM:SS.\", 'parameters': {'type': 'object', 'properties': {'url': {'type': 'string', 'description': 'The URL of the YouTube video to retrieve the transcript/subtitles for. (e.g. https://www.youtube.com/watch?v=dQw4w9WgXcQ)', 'items': None}}, 'required': ['url']}}]}] format=<ToolFormat.GEMINI: 'gemini'> error=None\n",
      "\ud83d\udd27 Calling function: get_youtube_video_transcript\n",
      "Andrej Karpathy, former director of AI at Tesla, discusses the evolution of software and the impact of large language models (LLMs).\n",
      "\n",
      "**Software 1.0, 2.0, and 3.0:** Karpathy describes three paradigms of software development: Software 1.0 involves writing explicit code for computers, while Software 2.0 uses neural networks, where the code is the network's weights, learned through optimization. Software 3.0, the latest paradigm, utilizes natural language prompts to program LLMs.  He draws parallels between Hugging Face and GitHub, highlighting how Hugging Face acts as a repository for Software 2.0, similar to GitHub's role for Software 1.0.\n",
      "\n",
      "**LLMs as Operating Systems:** Karpathy argues that LLMs are not merely utilities like electricity, but complex operating systems. He likens LLM labs (OpenAI, Gemini, etc.) to utility providers, building the infrastructure (the grid), and charging for metered access. He also points out that LLMs share characteristics with fabs, requiring substantial capital expenditure and possessing rapidly evolving technology trees.  He draws parallels to the operating system landscape, with closed-source providers like Windows and Mac OS and open-source alternatives like Linux, comparing them to the current state of LLMs, where a few companies control access and open-source models like Llama are emerging.  He emphasizes the importance of fluency in all three software paradigms for those entering the industry, due to the unique advantages and disadvantages of each.  Karpathy envisions LLMs as the CPU, context windows as memory, and prompts as instructions in this new operating system. He further notes that LLM apps, like Cursor and Perplexity, resemble traditional apps running on different operating systems, and he suggests we are in a stage similar to the 1960s of computing, with centralized, time-shared access being the norm due to cost.\n",
      "\n",
      "**The Psychology of LLMs:**  Karpathy describes LLMs as \"stochastic simulations of people,\" possessing encyclopedic knowledge and memory, similar to an autistic savant. However, they also exhibit cognitive deficits, including hallucinations, jagged intelligence (superhuman in some areas, subhuman in others), and a lack of self-knowledge. He points to their susceptibility to prompt injection and data leaks as security concerns. He recommends watching the movies \"Rainman\", \"Memento\", and \"51st Dates\" to better understand the memory and knowledge retention characteristics of LLMs.  These limitations require carefully crafted prompts and a balanced approach to utilizing their strengths while mitigating their weaknesses.\n",
      "\n",
      "**Opportunities and Challenges:** Karpathy advocates for \"partial autonomy apps\" that leverage LLMs while maintaining human oversight.  He emphasizes the importance of fast generation-verification loops, aided by GUIs and visual representations.  He cautions against overreliance on AI agents, stressing the need to keep them \"on the leash\" due to their fallibility.  He draws an analogy to his experience at Tesla, where initial success with self-driving cars in 2013 led to overly optimistic predictions. The lesson, he argues, is that software development, like driving, is complex, and complete autonomy will take time. He recommends focusing on partial autonomy products with user-friendly interfaces and adjustable autonomy sliders, allowing for increased automation as the technology matures.  He encourages developers to consider how to give LLMs access to the information humans can see and the actions they can take, and to allow humans to effectively supervise their work.  He recommends thinking of this as building Iron Man suits (augmentation) rather than Iron Man robots (full autonomy) for now.\n",
      "\n",
      "**Vibe Coding and Agent-Based Development:** Karpathy discusses the democratization of programming through natural language, citing the popularity of \"vibe coding,\" where users, even without formal coding experience, can describe what they want to create, allowing LLMs to generate code based on the prompt. This natural language interface opens up software development to a much wider audience. He showcases his own experience with vibe coding an iOS app and the \"Menu Genen\" app. He highlights that generating code is now the easy part, while the difficulty lies in deployment and Dev Ops tasks. He also suggests the use of simple, LLM-friendly formats for documentation, such as markdown or protocols like the Model Context Protocol, to facilitate better interaction with these models.  He mentions using natural language prompts to utilize documentation like the Manim animation library. He proposes ideas like `llm.txt` (analogous to `robots.txt`) for websites to communicate directly with LLMs.  He also points out that GUIs are not agent-friendly and much current documentation is for humans, not LLMs, requiring a shift in how interfaces and information are presented.\n",
      "\n",
      "Karpathy concludes with an optimistic outlook on the future of software development, emphasizing the need for both human-driven coding and agent-based development, and the exciting possibilities that arise from the combination of human creativity and AI capabilities.\n",
      "\n"
     ]
    }
   ],
   "source": [
    "YOUTUBE_VIDEO_URL = \"https://www.youtube.com/watch?v=LCEmiRjPEtQ\"  # pick a video you like!\n",
    "\n",
    "result = gemini_with_mcp_server(\n",
    "    mcp_server_url=youtube_mcp_instance.server_url, \n",
    "    user_query=f\"Please provide a complete summary of this YouTube video with timestamp: {YOUTUBE_VIDEO_URL}\"\n",
    ")\n",
    "\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "\u2705 Great! You've successfully created an AI agent that uses Gemini's function calling with Klavis MCP servers to summarize YouTube videos!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## Case Study 2 : Gemini + Gmail MCP Server (OAuth needed)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Create Gmail MCP server instance\n",
    "gmail_mcp_server = klavis_client.mcp_server.create_server_instance(\n",
    "    server_name=McpServerName.GMAIL,\n",
    "    user_id=\"1234\",\n",
    ")\n",
    "\n",
    "webbrowser.open(gmail_mcp_server.oauth_url)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\ud83d\udce6 Available tools: success=True tools=[{'function_declarations': [{'name': 'send_email', 'description': 'Sends a new email', 'parameters': {'type': 'object', 'properties': {'to': {'type': 'array', 'description': 'List of recipient email addresses', 'items': {'type': 'string'}}, 'subject': {'type': 'string', 'description': 'Email subject', 'items': None}, 'body': {'type': 'string', 'description': 'Email body content (used for text/plain or when htmlBody not provided)', 'items': None}, 'htmlBody': {'type': 'string', 'description': 'HTML version of the email body', 'items': None}, 'mimeType': {'type': 'string', 'description': 'Email content type', 'items': None}, 'cc': {'type': 'array', 'description': 'List of CC recipients', 'items': {'type': 'string'}}, 'bcc': {'type': 'array', 'description': 'List of BCC recipients', 'items': {'type': 'string'}}, 'threadId': {'type': 'string', 'description': 'Thread ID to reply to', 'items': None}, 'inReplyTo': {'type': 'string', 'description': 'Message ID being replied to', 'items': None}}, 'required': ['to', 'subject', 'body']}}, {'name': 'draft_email', 'description': 'Draft a new email', 'parameters': {'type': 'object', 'properties': {'to': {'type': 'array', 'description': 'List of recipient email addresses', 'items': {'type': 'string'}}, 'subject': {'type': 'string', 'description': 'Email subject', 'items': None}, 'body': {'type': 'string', 'description': 'Email body content (used for text/plain or when htmlBody not provided)', 'items': None}, 'htmlBody': {'type': 'string', 'description': 'HTML version of the email body', 'items': None}, 'mimeType': {'type': 'string', 'description': 'Email content type', 'items': None}, 'cc': {'type': 'array', 'description': 'List of CC recipients', 'items': {'type': 'string'}}, 'bcc': {'type': 'array', 'description': 'List of BCC recipients', 'items': {'type': 'string'}}, 'threadId': {'type': 'string', 'description': 'Thread ID to reply to', 'items': None}, 'inReplyTo': {'type': 'string', 'description': 'Message ID being replied to', 'items': None}}, 'required': ['to', 'subject', 'body']}}, {'name': 'read_email', 'description': 'Retrieves the content of a specific email', 'parameters': {'type': 'object', 'properties': {'messageId': {'type': 'string', 'description': 'ID of the email message to retrieve', 'items': None}}, 'required': ['messageId']}}, {'name': 'search_emails', 'description': 'Searches for emails using Gmail search syntax', 'parameters': {'type': 'object', 'properties': {'query': {'type': 'string', 'description': \"Gmail search query (e.g., 'from:example@gmail.com')\", 'items': None}, 'maxResults': {'type': 'number', 'description': 'Maximum number of results to return', 'items': None}}, 'required': ['query']}}, {'name': 'modify_email', 'description': 'Modifies email labels (move to different folders)', 'parameters': {'type': 'object', 'properties': {'messageId': {'type': 'string', 'description': 'ID of the email message to modify', 'items': None}, 'addLabelIds': {'type': 'array', 'description': 'List of label IDs to add to the message', 'items': {'type': 'string'}}, 'removeLabelIds': {'type': 'array', 'description': 'List of label IDs to remove from the message', 'items': {'type': 'string'}}}, 'required': ['messageId']}}, {'name': 'delete_email', 'description': 'Permanently deletes an email', 'parameters': {'type': 'object', 'properties': {'messageId': {'type': 'string', 'description': 'ID of the email message to delete', 'items': None}}, 'required': ['messageId']}}, {'name': 'batch_modify_emails', 'description': 'Modifies labels for multiple emails in batches', 'parameters': {'type': 'object', 'properties': {'messageIds': {'type': 'array', 'description': 'List of message IDs to modify', 'items': {'type': 'string'}}, 'addLabelIds': {'type': 'array', 'description': 'List of label IDs to add to all messages', 'items': {'type': 'string'}}, 'removeLabelIds': {'type': 'array', 'description': 'List of label IDs to remove from all messages', 'items': {'type': 'string'}}, 'batchSize': {'type': 'number', 'description': 'Number of messages to process in each batch (default: 50)', 'items': None}}, 'required': ['messageIds']}}, {'name': 'batch_delete_emails', 'description': 'Permanently deletes multiple emails in batches', 'parameters': {'type': 'object', 'properties': {'messageIds': {'type': 'array', 'description': 'List of message IDs to delete', 'items': {'type': 'string'}}, 'batchSize': {'type': 'number', 'description': 'Number of messages to process in each batch (default: 50)', 'items': None}}, 'required': ['messageIds']}}]}] format=<ToolFormat.GEMINI: 'gemini'> error=None\n",
      "\ud83d\udd27 Calling function: send_email\n",
      "OK. I've sent the email.\n",
      "\n"
     ]
    }
   ],
   "source": [
    "EMAIL_RECIPIENT = \"zihaolin@klavis.ai\" # Replace with your email\n",
    "EMAIL_SUBJECT = \"Test Gemini + Gmail MCP Server\"\n",
    "EMAIL_BODY = \"Hello World from Gemini!\"\n",
    "\n",
    "result = gemini_with_mcp_server(\n",
    "    mcp_server_url=gmail_mcp_server.server_url, \n",
    "    user_query=f\"Please send an email to {EMAIL_RECIPIENT} with subject {EMAIL_SUBJECT} and body {EMAIL_BODY}\"\n",
    ")\n",
    "\n",
    "print(result)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "## Summary\n",
    "\n",
    "This tutorial demonstrated how to integrate Google's Gemini with function calling capabilities with Klavis MCP servers to create powerful AI applications. We covered practical examples and interactive features:\n",
    "\n",
    "**\ud83c\udfa5 YouTube Integration**: Built an AI assistant that can automatically summarize YouTube videos by extracting transcripts and providing detailed, timestamped summaries.\n",
    "\n",
    "**\ud83d\udce7 Gmail Integration**: Created an AI-powered email assistant that can send emails through Gmail with OAuth authentication.\n",
    "\n",
    "**\ud83d\udcac Interactive Chat**: Added multi-turn conversation capabilities that maintain context across interactions.\n",
    "\n",
    "### Key Takeaways:\n",
    "- **Modern API**: Uses the latest `google-genai` library with improved type safety and performance\n",
    "- **Easy Setup**: Klavis MCP servers can be created with just a few lines of code\n",
    "- **Robust Function Calling**: Better error handling and response management\n",
    "- **Conversation Context**: Maintains state across multiple interactions\n",
    "- **Versatile**: Support for both simple APIs (YouTube) and OAuth-authenticated services (Gmail)\n",
    "- **Scalable**: The same pattern can be applied to any of the MCP servers available in Klavis\n",
    "- **Developer Friendly**: Enhanced logging and debugging capabilities\n",
    "\n",
    "### Next Steps:\n",
    "- Try different MCP servers from Klavis (Notion, Slack, Airtable, etc.)\n",
    "- Experiment with multi-modal capabilities using images and files\n",
    "- Build more complex workflows with multiple function calls\n",
    "- Integrate with your own applications and use cases\n",
    "\n",
    "**Happy building!** \ud83d\ude80\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}